Abstract: This paper studies the fluctuations of the signal-to-noise ratio
(SNR) of minimum variance distortionless response (MVDR) filters implementing
diagonal loading in the estimation of the covariance matrix. Previous results
in the signal processing literature are generalized and extended by
considering both spatially and temporally correlated samples. Specifically, a
central limit theorem (CLT) is established for the fluctuations of the SNR of
the diagonally loaded MVDR filter, under both supervised and unsupervised
training settings in adaptive filtering applications. Our second-order
analysis is based on the Nash-Poincaré inequality and the integration by parts
formula for Gaussian functionals, as well as classical tools from statistical
asymptotic theory. Numerical evaluations validating the accuracy of the CLT
confirm the asymptotic Gaussianity of the fluctuations of the SNR of the MVDR
filter.
1. Introduction

The minimum variance distortionless response (MVDR) filter is a prominent
instance of a multivariate filtering structure in statistical signal
processing. Also known as the Capon beamformer, the MVDR spatial filter is
widely utilized in sensor array signal processing applications, such as the
estimation of the waveform and/or power of a given signal of interest (SOI)
[1, 2]. The theoretically optimal Capon/MVDR spatial filter is constructed
based on a covariance matrix that is unknown in practice, and so any filter
implementation must rely on sample estimates computed from the array
observations available. Sample covariance estimators are well known to be
prohibitively inaccurate when the sample size is small relative to the
observation dimension. Indeed, a vast body of contributions in the literature
of array processing and other fields of applied statistics has been devoted to
remedies for lifting the curse of dimensionality, such as those based on
regularization techniques and shrinkage estimation.

In this work, we are interested in the signal-to-noise ratio (SNR) at the output 
of MVDR filter realizations using a diagonally loaded sample covariance matrix 
(SCM). We focus on the SNR as a measure conventionally used to evaluate the 
performance of a filter implementation. Due to its dependence on the sample 
data matrix, the SNR is itself a random variable whose behavior highly depends 
on the ratio between sample size and observation dimension. This ratio is indeed 
of much practical relevance for characterizing the properties of the filter per- 
formance. Motivated by this fact, a large-system performance characterization 
was presented in [3, Proposition 1], where the authors provide a deterministic 
equivalent of the output SNR in the limiting regime defined by both the number 
of samples and the observation dimension growing large without bound at the 
same rate (see also [4]). 

A first-order asymptotic analysis precludes us from gaining any insight on the 
fluctuations of the SNR performance measure. Therefore, our focus in this work 
is on a second-order analysis of the previous quantity. In the case of Gaussian 
observations, when the maximum likelihood estimator of the population covari- 
ance matrix is applied without diagonal loading, the normalized output SNR 
is known in the array processing literature to follow a Beta distribution [5]. In 
the general and more relevant case for practical implementations considering the 
application of diagonal loading, the problem of characterizing the distribution of 
the previous random variable remains unsolved. Earlier attempts focused on the 
output response of the classical diagonally loaded Capon/MVDR beamformer, 
by approximating its probability density function via the truncation of a matrix 
power series [6] (see also introductory exposition therein for details on previous 
related work), and for the particular cases of zero- and single-source scenarios 
[7], as well as a two-source scenario [8]. 

In this paper, we generalize previous studies by considering both the use of
diagonal loading and general spatio-temporally correlated observations.
Specifically, we prove the asymptotic Gaussianity of the sample performance
measure by establishing a central limit theorem (CLT) on the output SNR of
a diagonally loaded MVDR filter implementation. To that effect, we resort to
a set of techniques for Gaussian random matrices, namely the Nash-Poincaré
inequality as well as the integration by parts formula for Gaussian
functionals. These tools were originally proposed in [9] for the study of the
asymptotic distribution of the mutual information of correlated MIMO Rayleigh
channels. More recently, they have also been applied, for instance, to obtain
asymptotic non-Gaussian approximations of the distribution of the SNR of the
linear minimum mean-square error (LMMSE) receiver [10], as well as to derive
the input covariance matrix maximizing the capacity of correlated MIMO Rician
channels [11].

Our framework relies on a limiting regime defined as both dimensions of 
the data matrix going to infinity at the same rate. Indeed, in real-life array 
processing applications, both the number of samples and the dimension of the 
array are comparable in magnitude, and so a limiting regime allowing for both 
sample size and dimension growing large with a fixed, non-zero ratio between 
them is of more practical relevance. We will consider both supervised and un- 
supervised training methods in statistical signal and sensor array processing 
applications (see, e.g., [12, 13]). In the former, access to SOI-free samples of
the interference-plus-noise process is granted for covariance matrix estimation 
(e.g., clutter statistics in space-time adaptive processing applications to radar), 
whereas only SOI-contaminated samples are available for inference in the latter. 

The structure of the rest of the paper after the previous exposition of the re- 
search motivation is as follows. Upon concluding this section by introducing the 
notation that will be used throughout the paper, Section 2 briefly presents the 
problem of multivariate minimum variance filtering; the typical implementation 
based on a diagonally loaded sample covariance matrix (SCM) is introduced 
along with the definition of SNR as performance measure of relevance. In Sec- 
tion 3 we establish the CLT for the fluctuations of the SNR performance of both 
supervised and unsupervised training methods. In Section 4, we introduce the 
main mathematical tools for our analysis and state some preliminary results 
serving as preparation for the proof of the CLT. Our result on the asymptotic 
Gaussianity of SNR measures is numerically validated in Section 6, before con- 
cluding the paper with Section 7. The technical details of the proof of the CLT 
in Section 4 are postponed to the appendices.

Notation. In this paper, we use the following notation. All vectors are
defined as column vectors and designated with bold lower case; all matrices are
given in bold upper case; for both vectors and matrices a subscript will be
added to emphasize dependence on dimension, though it will be occasionally
dropped for the sake of clarity of presentation; $[\cdot]_{ij}$ will be used
with matrices to extract the entry in the $i$th row of the $j$th column, and
$[\cdot]_j$ will be used for the $j$th entry of a vector or the nonzero
elements of a diagonal matrix; $(\cdot)^T$ denotes transpose; $(\cdot)^*$
denotes Hermitian (i.e., complex conjugate) transpose; $\mathbf{I}_M$ denotes
the $M \times M$ identity matrix; $\operatorname{tr}[\cdot]$ denotes the matrix
trace operator; $\mathbb{R}$ and $\mathbb{C}$ denote the real and complex
fields of dimension specified by a superscript; $\mathbb{R}^+$ denotes the set
of positive real numbers; $\mathbb{P}(\cdot)$ denotes the probability of a
random event, $\mathbb{E}[\cdot]$ denotes the expectation operator, and
$\operatorname{var}(\cdot)$ and $\operatorname{cov}(\cdot,\cdot)$ denote,
respectively, variance and covariance; $K$, $K_p$ denote constant values not
depending on any relevant quantity, apart from the latter on a parameter $p$;
$|\cdot|$ denotes absolute value; for any two functions $f_N, g_N$ depending on
$N$, $f_N = O(g_N)$ will denote the fact that $|f_N| \le K |g_N|$ for
sufficiently large $N$, and $f_N = o_p(1)$ will denote convergence in
probability to zero of $f_N$; $\|\cdot\|$ denotes the Euclidean norm for
vectors and the induced norm for matrices (i.e., spectral or strong norm),
whereas $\|\cdot\|_F$ and $\|\cdot\|_{\mathrm{tr}}$ denote the Frobenius norm
and trace (or nuclear) norm, respectively, i.e., for a matrix
$\mathbf{A} \in \mathbb{C}^{M \times M}$ with eigenvalues $\lambda_m$,
$m = 1, \ldots, M$, and spectral radius
$\rho(\mathbf{A}) = \max_{1 \le m \le M}(|\lambda_m|)$,
$\|\mathbf{A}\| = (\rho(\mathbf{A}^*\mathbf{A}))^{1/2}$,
$\|\mathbf{A}\|_F = (\operatorname{tr}[\mathbf{A}^*\mathbf{A}])^{1/2}$ and
$\|\mathbf{A}\|_{\mathrm{tr}} = \operatorname{tr}[(\mathbf{A}^*\mathbf{A})^{1/2}]$.



2. MVDR filtering with diagonal loading 



In this section, we introduce the signal model and briefly review the problem
of spatial or multivariate MVDR filtering motivating our research. Let
$\mathbf{Y}_{\beta,N} = [\mathbf{y}_\beta(1), \ldots, \mathbf{y}_\beta(N)]$ be
the data matrix with sample observations in a statistical signal processing
application, where the parameter $\beta$ indicates presence ($\beta = 1$) or
not ($\beta = 0$) of the SOI in the observations, which are modeled as:

$$\mathbf{y}_\beta(n) = \beta\, s(n)\,\mathbf{s} + \mathbf{n}(n) \in \mathbb{C}^M, \quad 1 \le n \le N \tag{2.1}$$

where $s(n)$ is the waveform process of a given SOI, the vector $\mathbf{s}$
models the SOI signature, and $\mathbf{n}(n)$ represents the contribution from
some colored interference and the cross-sectionally uncorrelated background
noise, which we model jointly as a zero-mean Gaussian process with covariance
matrix $\mathbf{R}_{0,M}$. Signal and interference-plus-noise processes are
assumed to be independent. Additionally, without loss of generality we will
assume that the SOI power is 1, and also that $\|\mathbf{s}\| = 1$. In
particular, we consider applications relying on supervised training, where
$\mathbf{Y}_{\beta,N} = \mathbf{Y}_{0,N}$ contains SOI-free samples of the
interference-plus-noise process, or unsupervised training, where the training
samples in $\mathbf{Y}_{\beta,N} = \mathbf{Y}_{1,N}$ are contaminated by the
SOI. Notice that each observation $\mathbf{y}_\beta(n)$ might be modeling the
matched filter output sufficient statistic for the received unknown symbols
$s(n)$ at a multiuser detector in a communications application, where
$\mathbf{s}$ is the effective user signature; or an array processor, where
$\mathbf{s}$ contains the angular frequency information (steering vector)
related to the intended source, represented by $s(n)$.

In order to allow for a more general signal modeling context, we consider the
case in which the vector observations are not only spatially or
cross-sectionally correlated but also present a certain correlation in the
time domain. This is typically the case in array processing applications where
the sources exhibit nonzero correlation between delayed samples [14], as well
as generally for wireless communication signals that are transmitted over a
dispersive radio channel. In this work, we consider spatio-temporal processes
with separable covariance structure, also regarded as having Kronecker product
structure, and thoroughly studied in the literature on multiple-input
multiple-output wireless communication channels [15], and sensor array and
multichannel processing [16]. In particular, the spatial covariance matrix
will be denoted by $\mathbf{R}_{\beta,M}$, and the time correlation pattern
will be modeled by a nonnegative matrix denoted by $\mathbf{T}_N$, so that the
column vectors of $\mathbf{Y}_{\beta,N}$ are correlated (in the time domain)
but the correlation pattern is identical for all rows. Notice that the spatial
covariance matrix $\mathbf{R}_{\beta,M}$ is intrinsically different depending
on the type of training, i.e., $\mathbf{R}_{\beta,M} = \mathbf{R}_{0,M}$ for
supervised training, and
$\mathbf{R}_{\beta,M} = \mathbf{R}_{1,M} = \mathbf{s}\mathbf{s}^* + \mathbf{R}_{0,M}$
for unsupervised training. As an illustrative example, consider the following
first-order vector autoregressive process:
$\mathbf{y}_\beta(n) = \psi\,\mathbf{y}_\beta(n-1) + \mathbf{R}_{\beta,M}^{1/2}\mathbf{v}(n)$,
where $\psi$ is a real-valued constant and $\mathbf{v}(n)$ is a white Gaussian
noise process with zero mean and identity covariance matrix, and
$\mathbf{R}_{\beta,M}^{1/2}$ is a square-root of a positive matrix
$\mathbf{R}_{\beta,M}$. In particular, the previous so-called VAR(1) model
has covariance matrix with separable (Kronecker product) structure given by

$$\operatorname{cov}\left([\mathbf{y}_\beta(n)]_i, [\mathbf{y}_\beta(n+\tau)]_j\right) = [\mathbf{R}_{\beta,M}]_{ij}\,\frac{\psi^{|\tau|}}{1-\psi^2}.$$

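The separable structure of the VAR(1) example can be checked numerically. The following is a minimal sketch, assuming real-valued data for simplicity (the paper works with complex observations) and hypothetical values for the innovation covariance `R`, the coefficient `psi` and the lag `tau`; it compares an empirical lagged covariance with $\mathbf{R}\,\psi^{|\tau|}/(1-\psi^2)$.

```python
import numpy as np

rng = np.random.default_rng(5)
M, n_samples, psi, tau = 2, 100_000, 0.6, 2   # illustrative values only

# Hypothetical innovation covariance R (symmetric positive definite).
R = np.array([[1.0, 0.3],
              [0.3, 0.5]])
R_half = np.linalg.cholesky(R)                # any square root of R works here

# Innovations R^{1/2} v(n) with v(n) white, zero mean, identity covariance.
W = R_half @ rng.standard_normal((M, n_samples))

Y = np.empty((M, n_samples))
# Start from the stationary distribution, whose covariance is R / (1 - psi^2).
Y[:, 0] = R_half @ rng.standard_normal(M) / np.sqrt(1.0 - psi**2)
for n in range(1, n_samples):
    Y[:, n] = psi * Y[:, n - 1] + W[:, n]

# Empirical estimate of cov([y(n)]_i, [y(n + tau)]_j).
emp = Y[:, :-tau] @ Y[:, tau:].T / (n_samples - tau)
theory = R * psi**tau / (1.0 - psi**2)
max_err = np.max(np.abs(emp - theory))
```

The spatial factor `R` and the temporal factor `psi**tau / (1 - psi**2)` enter the covariance as a product, which is the Kronecker structure referred to in the text.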
Motivated by typical applications in sensor array signal processing, in this 
paper we concentrate on the problem of linearly filtering the observed samples 
with a Capon/MVDR beamformer to estimate the SOI waveform assuming that 
the SOI signature is known. We notice that a related problem that is not handled 
here but can also be fitted into our framework is that of estimating the SOI power 
[2]. Customarily, the problem of optimizing the coefficients of the Capon/MVDR 
spatial filter is formulated in terms of the spatial covariance matrix as: 

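$$\mathbf{w}_{\beta,\mathrm{MVDR}} = \arg\min_{\mathbf{w} \in \mathbb{C}^M :\, \mathbf{w}^*\mathbf{s} = 1} \mathbf{w}^*\mathbf{R}_{\beta,M}\mathbf{w}$$

with explicit solution being given by

$$\mathbf{w}_{\beta,\mathrm{MVDR}} = \frac{\mathbf{R}_{\beta,M}^{-1}\mathbf{s}}{\mathbf{s}^*\mathbf{R}_{\beta,M}^{-1}\mathbf{s}}. \tag{2.2}$$

Under the above conventional assumptions, the two previous covariance matrices
differ by the rank-one matrix term $\mathbf{s}\mathbf{s}^*$, and so it is easy
to see that the optimal solutions with $\mathbf{R}_{0,M}$ and
$\mathbf{R}_{1,M}$ are equivalent, i.e.,
$\mathbf{w}_{0,\mathrm{MVDR}} = \mathbf{w}_{1,\mathrm{MVDR}}$.
Conventionally, the evaluation of the performance of the filter is based on the
SNR measure, which is defined as

$$\mathrm{SNR}(\mathbf{w}) = \frac{|\mathbf{w}^*\mathbf{s}|^2}{\mathbf{w}^*\mathbf{R}_{0,M}\mathbf{w}}. \tag{2.3}$$

In particular, we have (see Footnote 1)
$\mathrm{SNR}(\mathbf{w}_{0,\mathrm{MVDR}}) = \mathrm{SNR}(\mathbf{w}_{1,\mathrm{MVDR}}) = \mathbf{s}^*\mathbf{R}_{0,M}^{-1}\mathbf{s} = \mathrm{SNR}_{\mathrm{opt}}$.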
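The equivalence of the two optimal filters and the value of the optimal SNR can be verified on a small example. The sketch below assumes a randomly generated covariance `R0` and signature `s` (both hypothetical, chosen only for illustration) and evaluates the solution (2.2) and the SNR measure (2.3) directly.

```python
import numpy as np

rng = np.random.default_rng(0)
M = 8  # observation dimension (illustrative)

# Hypothetical interference-plus-noise covariance R0 (Hermitian positive definite).
A = rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M))
R0 = A @ A.conj().T / M + np.eye(M)

# Hypothetical unit-norm SOI signature s.
s = rng.standard_normal(M) + 1j * rng.standard_normal(M)
s /= np.linalg.norm(s)


def mvdr_weights(R, s):
    """MVDR solution (2.2): R^{-1} s / (s^* R^{-1} s)."""
    Rinv_s = np.linalg.solve(R, s)
    return Rinv_s / (s.conj() @ Rinv_s)


def snr(w, s, R0):
    """Output SNR (2.3) for unit SOI power: |w^* s|^2 / (w^* R0 w)."""
    return (np.abs(w.conj() @ s) ** 2 / (w.conj() @ R0 @ w)).real


R1 = R0 + np.outer(s, s.conj())      # unsupervised covariance R1 = s s^* + R0
w0 = mvdr_weights(R0, s)             # supervised optimal filter
w1 = mvdr_weights(R1, s)             # unsupervised optimal filter
snr_opt = (s.conj() @ np.linalg.solve(R0, s)).real
```

Both filters satisfy the distortionless constraint $\mathbf{w}^*\mathbf{s} = 1$, and the rank-one term only rescales $\mathbf{R}^{-1}\mathbf{s}$, which the normalization in (2.2) removes.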

Footnote 1: It is not difficult to see that the maximum SNR values for
supervised and unsupervised training theoretically coincide. In practice,
however, the actual performance of an unsupervised training method would be
diminished by inaccuracies about the knowledge of the precise SOI signature,
and therefore a supervised training method is preferred in this sense.

In practice, the covariance matrix is unknown and so any implementation
of the filter must rely on estimates built upon a set of training samples. The
standard SCM estimator is usually improved by means of, for instance,
regularization or shrinkage. In particular, we consider covariance matrix
estimators of the type

$$\hat{\mathbf{R}}_{\beta,M} = \frac{1}{N}\mathbf{Y}_{\beta,N}\mathbf{Y}_{\beta,N}^* + \alpha\mathbf{I}_M \tag{2.4}$$

where $\alpha > 0$ is a constant scalar that in the array processing literature
is referred to as diagonal loading factor, and is also known in the statistics
literature as shrinkage intensity parameter for the type of James-Stein
shrinkage covariance matrix estimators. In brief, the purpose of the
regularization term $\alpha\mathbf{I}_M$ is to improve the condition number of
an a priori possibly unstable estimator of the covariance matrix of the array
observations. This is particularly the case for the SCM in situations where
$N$ is not considerably larger than $M$. Indeed, notice that the SCM might not
even be invertible, as happens in the case $M > N$. Well-conditioned
covariance matrix estimators can be expected to improve the filter performance
as measured by the realized SNR defined in (2.3). In this work, we assume that
the parameter $\alpha$ is given and fixed. For sensible choices of the
regularization or diagonal loading parameter $\alpha$, we refer the reader to,
e.g., [1, 17].
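The effect of the loading term in (2.4) can be illustrated in the regime $M > N$, where the plain SCM is singular. The following is a minimal sketch under assumed values of `M`, `N`, `alpha`, the signature `s` and a diagonal true covariance `R0` (all hypothetical); it builds the loaded estimator from SOI-free samples and evaluates the realized SNR through (2.3).

```python
import numpy as np

rng = np.random.default_rng(1)
M, N, alpha = 20, 10, 0.5             # M > N: the plain SCM is rank deficient

s = np.ones(M) / np.sqrt(M)           # hypothetical unit-norm signature
r0 = np.linspace(0.5, 2.0, M)         # hypothetical eigenvalues of the true covariance
R0 = np.diag(r0)

# SOI-free training samples (supervised case) with covariance R0.
X = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2)
Y = np.sqrt(r0)[:, None] * X

scm = Y @ Y.conj().T / N              # sample covariance matrix
R_hat = scm + alpha * np.eye(M)       # diagonally loaded estimator (2.4)

# Sample MVDR filter: (2.2) with R_hat in place of the unknown covariance.
Rinv_s = np.linalg.solve(R_hat, s)
w_hat = Rinv_s / (s.conj() @ Rinv_s)

snr_realized = (np.abs(w_hat.conj() @ s) ** 2 / (w_hat.conj() @ R0 @ w_hat)).real
snr_opt = (s.conj() @ np.linalg.solve(R0, s)).real
rank_scm = np.linalg.matrix_rank(scm)
min_eig_loaded = np.linalg.eigvalsh(R_hat).min()
```

The smallest eigenvalue of the loaded estimator is bounded below by `alpha`, so the filter is well defined even though the SCM has rank at most `N`. The realized SNR never exceeds `snr_opt`, since the optimal filter maximizes (2.3).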

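We now handle the situation in which a covariance matrix estimator of the
type of (2.4) is used in order to implement the sample version of the
theoretical MVDR filter, which will be denoted in the sequel by
$\hat{\mathbf{w}}_{\beta,\mathrm{MVDR}}$, $\beta = 0, 1$. Then, using
$\mathbf{w} = \hat{\mathbf{w}}_{\beta,\mathrm{MVDR}}$ in (2.3), we obtain,
respectively,

$$\mathrm{SNR}(\hat{\mathbf{w}}_{0,\mathrm{MVDR}}) = \frac{\left(\mathbf{s}^*\hat{\mathbf{R}}_{0,M}^{-1}\mathbf{s}\right)^2}{\mathbf{s}^*\hat{\mathbf{R}}_{0,M}^{-1}\mathbf{R}_{0,M}\hat{\mathbf{R}}_{0,M}^{-1}\mathbf{s}} \tag{2.5}$$

and

$$\mathrm{SNR}(\hat{\mathbf{w}}_{1,\mathrm{MVDR}}) = \left(\frac{\mathbf{s}^*\hat{\mathbf{R}}_{1,M}^{-1}\mathbf{R}_{1,M}\hat{\mathbf{R}}_{1,M}^{-1}\mathbf{s}}{\left(\mathbf{s}^*\hat{\mathbf{R}}_{1,M}^{-1}\mathbf{s}\right)^2} - 1\right)^{-1}. \tag{2.6}$$

Equations (2.5) and (2.6) are obtained by directly replacing in (2.3) the
MVDR filter solution in (2.2), with the estimator (2.4) in place of the
unknown covariance matrix, for, respectively, the supervised ($\beta = 0$) and
unsupervised ($\beta = 1$) cases. Notice that, while the expression in (2.5)
follows straightforwardly, in order to get (2.6) it is enough to apply the
matrix inversion lemma using the fact that
$\mathbf{R}_{1,M} = \mathbf{s}\mathbf{s}^* + \mathbf{R}_{0,M}$.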

In effect, due to the dependence on the random data matrix
$\mathbf{Y}_{\beta,N}$, the quantities (2.5) and (2.6) are random variables
themselves whose distributions specify the fluctuations of the SNR performance
at the filter output. Consequently, in order to understand the behavior of the
output SNR performance, it is of much practical interest to investigate the
distribution of the random variables (2.5) and (2.6), and characterize their
properties. Under the supervised training setting, in the special case given
by $\hat{\mathbf{R}}_{0,M}$ being the standard SCM estimator, i.e.,
$\mathbf{T}_N = \mathbf{I}_N$ and $\alpha = 0$, the distribution of the
normalized output SNR, namely,

$$\frac{\mathrm{SNR}(\hat{\mathbf{w}}_{0,\mathrm{MVDR}})}{\mathrm{SNR}_{\mathrm{opt}}} = \frac{\left(\mathbf{s}^*\hat{\mathbf{R}}_{0,M}^{-1}\mathbf{s}\right)^2}{\mathbf{s}^*\hat{\mathbf{R}}_{0,M}^{-1}\mathbf{R}_{0,M}\hat{\mathbf{R}}_{0,M}^{-1}\mathbf{s}\;\,\mathbf{s}^*\mathbf{R}_{0,M}^{-1}\mathbf{s}}$$

is known to be [5]

$$\mathrm{SNR}(\hat{\mathbf{w}}_{0,\mathrm{MVDR}})/\mathrm{SNR}_{\mathrm{opt}} \sim \mathrm{Beta}(N + 2 - M,\, M - 1).$$

In the general, more relevant case for practical implementations, where
arbitrary positive definite $\mathbf{T}_N$ and $\alpha$ are considered, the
problem of characterizing the distribution of the random variable
$\mathrm{SNR}(\hat{\mathbf{w}}_{0,\mathrm{MVDR}})$ remains unsolved. Likewise,
so is the case for $\mathrm{SNR}(\hat{\mathbf{w}}_{1,\mathrm{MVDR}})$.
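The classical Beta law quoted above for the unloaded case can be checked by simulation. The sketch below assumes, without loss of generality for this particular check, an identity true covariance and a canonical signature vector, with hypothetical dimensions `M` and `N`; it compares the first two empirical moments of the normalized SNR with those of a $\mathrm{Beta}(N + 2 - M, M - 1)$ variable.

```python
import numpy as np

rng = np.random.default_rng(2)
M, N, trials = 4, 16, 4000            # illustrative dimensions, N > M

s = np.zeros(M)
s[0] = 1.0                            # unit-norm signature; true covariance R0 = I

rho = np.empty(trials)
for t in range(trials):
    # Temporally white complex Gaussian samples (T_N = I_N), no loading (alpha = 0).
    X = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2)
    R_hat = X @ X.conj().T / N
    v = np.linalg.solve(R_hat, s)     # R_hat^{-1} s
    # Normalized SNR: (s^* R_hat^{-1} s)^2 / (s^* R_hat^{-2} s) when R0 = I, SNR_opt = 1.
    rho[t] = np.abs(s.conj() @ v) ** 2 / (v.conj() @ v).real

a, b = N + 2 - M, M - 1               # Beta parameters from the text
beta_mean = a / (a + b)
beta_var = a * b / ((a + b) ** 2 * (a + b + 1))
emp_mean, emp_var = rho.mean(), rho.var()
```

Agreement of two moments is only a partial check of the distributional claim; a histogram of `rho` against the Beta density would give a fuller picture.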

In the next section, we provide a CLT on the realized SNR performance
at the output of a sample MVDR filter implementing diagonal loading and
based on a set of spatio-temporally correlated observations, for both
supervised and unsupervised training applications. We remark that in this
paper we are specifically concerned with the case $\alpha > 0$. In fact, the
case $\alpha = 0$ has been seldom considered in the large random matrix
literature, and would indeed require specific tools different from those used
here.

3. CLT for the fluctuations of SNR performance measures 
3.1. Definitions and assumptions 

We next summarize our research hypotheses and introduce some new definitions.
We first remark that, anticipating that the statistical properties of the
random matrices $\mathbf{Y}_{\beta,N}$ and $\mathbf{R}_{\beta,M}$ for both
values of $\beta$ are equivalent for the purposes of our derivations, we will
drop the subscript $\beta$ in the sequel. Our analysis is based on the
following technical hypotheses:

(As1) The observations are normally distributed with zero mean and separable
covariances $\mathbf{R}_M$ and $\mathbf{T}_N$ in the spatial and time domain,
respectively.

(As2) The nonrandom matrices $\mathbf{R}_M$ and $\mathbf{T}_N$ have
eigenvalues bounded uniformly in, respectively, $M$ and $N = N(M)$, from
above, i.e., $\|\mathbf{R}\|_{\sup} = \sup_{M \ge 1}\|\mathbf{R}_M\| < +\infty$
and $\|\mathbf{T}\|_{\sup} = \sup_{N \ge 1}\|\mathbf{T}_N\| < +\infty$, and
from below (away from zero):
$\|\mathbf{R}\|_{\inf} = \inf_{M \ge 1}\|\mathbf{R}_M^{-1}\|^{-1} > 0$ and
$\|\mathbf{T}\|_{\inf} = \inf_{N \ge 1}\|\mathbf{T}_N^{-1}\|^{-1} > 0$.

(As3) We will consider the limiting regime defined by both dimensions $M$ and
$N$ growing large without bound at the same rate, i.e., $N, M \to \infty$ such
that ($c_M = M/N$):

$$0 < c_{\inf} = \liminf c_M \le c_{\sup} = \limsup c_M < \infty.$$

Let $\mathbf{X}_N$ be an $M \times N$ matrix whose elements $X_{ij}$,
$1 \le i \le M$, $1 \le j \le N$, are complex Gaussian random variables having
i.i.d. real and imaginary parts with mean zero and variance 1/2, such that
$\mathbb{E}[X_{ij}] = \mathbb{E}[X_{ij}^2] = 0$ and
$\mathbb{E}[|X_{ij}|^2] = 1$. Under the Gaussianity assumption, observe that
we can write the data matrix in Section 2 as
$\mathbf{Y}_N = \mathbf{R}_M^{1/2}\mathbf{X}_N\mathbf{T}_N^{1/2}$, where
$\mathbf{R}_M^{1/2}$ and $\mathbf{T}_N^{1/2}$ are the positive definite
square-roots of $\mathbf{R}_M$ and $\mathbf{T}_N$, respectively. Hence, the
data matrix $\mathbf{Y}_N$ is matrix-variate normal distributed, i.e.,
$\mathbf{Y}_N \sim \mathcal{CN}_{M \times N}(\mathbf{0}_{M \times N}, \mathbf{R}_M, \mathbf{T}_N)$,
or equivalently,
$\operatorname{vec}(\mathbf{Y}_N) \sim \mathcal{CN}_{MN}(\mathbf{0}_{MN}, \mathbf{R}_M \otimes \mathbf{T}_N)$
[18]. Moreover, in the case of an arbitrary positive definite matrix
$\mathbf{T}_N$, we have that $\mathbf{Y}_N\mathbf{Y}_N^*$ is a central
quadratic form, such that
$\mathbb{E}[\mathbf{Y}_N\mathbf{Y}_N^*] = \operatorname{tr}[\mathbf{T}_N]\,\mathbf{R}_M$.
Thus, in particular, if $\mathbf{T}_N = \mathbf{I}_N$ then
$\mathbf{Y}_N\mathbf{Y}_N^*$ is central Wishart distributed, and we have
$\mathbb{E}[\mathbf{Y}_N\mathbf{Y}_N^*] = N\mathbf{R}_M$ (see also, e.g.,
[19, Chapter 2]). We note that our spatio-temporal covariance model represents
a non-trivial generalization of previous models, which is of interest for the
signal processing and the applied statistics community. For instance, the
model in [20, 21] consisting of a data matrix
$\mathbf{Y}_N = \mathbf{R}_M^{1/2}\boldsymbol{\Xi}_N$, where $\mathbf{R}_M$ is
the covariance matrix of the observations and $\boldsymbol{\Xi}_N$ is a
Gaussian matrix with standardized entries (i.e., with mean zero and variance
one), is clearly a special case of our model.
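The identity $\mathbb{E}[\mathbf{Y}_N\mathbf{Y}_N^*] = \operatorname{tr}[\mathbf{T}_N]\,\mathbf{R}_M$ for the separable model can be checked by Monte Carlo. The following is a minimal sketch, assuming small hypothetical diagonal matrices for $\mathbf{R}_M$ and $\mathbf{T}_N$ (the arrays `r` and `t` below are illustrative choices, not values from the paper).

```python
import numpy as np

rng = np.random.default_rng(3)
M, N, trials = 3, 5, 20000

r = np.array([0.5, 1.0, 2.0])                 # hypothetical diagonal of R_M
t = np.array([0.2, 0.6, 1.0, 1.4, 1.8])       # hypothetical diagonal of T_N
R_half, T_half = np.diag(np.sqrt(r)), np.diag(np.sqrt(t))

acc = np.zeros((M, M), dtype=complex)
for _ in range(trials):
    # Standard complex Gaussian matrix: E[X_ij] = E[X_ij^2] = 0, E|X_ij|^2 = 1.
    X = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2)
    Y = R_half @ X @ T_half                   # Y = R^{1/2} X T^{1/2}
    acc += Y @ Y.conj().T

mean_YYh = acc / trials                       # Monte Carlo estimate of E[Y Y^*]
expected = t.sum() * np.diag(r)               # tr[T_N] R_M
max_err = np.max(np.abs(mean_YYh - expected))
```

Setting `t` to a vector of ones recovers the Wishart case, where the expectation reduces to $N\mathbf{R}_M$.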

We recall that the previous distributional assumption is fairly standard in 
the array processing literature (e.g., [5, 6, 7], and [20, 21]). In particular, the 
Gaussianity assumption provides a means to obtain valuable approximations of 
the system performance by analytically characterizing the theoretical properties 
of otherwise intractable expressions of practical interest. On the other hand, 
the assumption of centered observations has minor impact, since observations 
can always be demeaned by extracting the sample mean. In fact, for Gaussian 
sample observations, the sample covariance matrix with and without estimation 
of the mean has the Wishart structure described above (with one degree-of- 
freedom less in the case of having to estimate the mean, which does not affect 
our asymptotic results). 

Before proceeding any further, we also notice that, thanks to the isotropic
invariance to orthogonal transformations of Gaussian matrices, the two
correlation matrices $\mathbf{R}_M$ and $\mathbf{T}_N$ can be assumed to be
diagonal without loss of generality. More specifically, using the fact that
the distribution of a Gaussian matrix is unaffected by unitary
transformations, it is easy to see that we can always write the SNR in (2.5)
and (2.6) in terms of a unit-norm deterministic vector, a Gaussian matrix with
standardized entries, and diagonal spatial and temporal covariance matrices.
Such a parsimonious representation is more convenient for proving our
statistical results, and is therefore preferred.

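We next introduce some notation that will be useful throughout the rest
of the paper. Let us first introduce the vector
$\mathbf{u}_M = \mathbf{R}_M^{-1/2}\mathbf{s}$ and the matrix
$\mathbf{Q}_M = \left(\frac{1}{N}\mathbf{X}_N\mathbf{T}_N\mathbf{X}_N^* + \alpha\mathbf{R}_M^{-1}\right)^{-1}$,
where $\alpha > 0$. Moreover, we define

$$\hat{a}_M = \mathbf{u}_M^*\mathbf{Q}_M\mathbf{u}_M, \qquad \hat{b}_M = \mathbf{u}_M^*\mathbf{Q}_M^2\mathbf{u}_M, \tag{3.1}$$

along with

$$\bar{a}_M = \mathbf{u}_M^*\mathbf{E}_M\mathbf{u}_M, \qquad \bar{b}_M = (1 - \gamma_M\tilde{\gamma}_M)^{-1}\,\mathbf{u}_M^*\mathbf{E}_M^2\mathbf{u}_M, \tag{3.2}$$

where $\gamma = \gamma_M = \frac{1}{N}\operatorname{tr}[\mathbf{E}_M^2]$ and
$\tilde{\gamma} = \tilde{\gamma}_M = \frac{1}{N}\operatorname{tr}[\tilde{\mathbf{E}}_N^2]$,
with

$$\mathbf{E}_M = \mathbf{R}_M\left(\tilde{\delta}_M\mathbf{R}_M + \alpha\mathbf{I}_M\right)^{-1}, \qquad \tilde{\mathbf{E}}_N = \mathbf{T}_N\left(\mathbf{I}_N + \delta_M\mathbf{T}_N\right)^{-1},$$

and $\{\delta_M, \tilde{\delta}_M\}$ being the unique positive solution to the
following system of equations:

$$\delta_M = \frac{1}{N}\operatorname{tr}\left[\mathbf{R}_M\left(\tilde{\delta}_M\mathbf{R}_M + \alpha\mathbf{I}_M\right)^{-1}\right], \qquad \tilde{\delta}_M = \frac{1}{N}\operatorname{tr}\left[\mathbf{T}_N\left(\mathbf{I}_N + \delta_M\mathbf{T}_N\right)^{-1}\right]. \tag{3.3}$$

The existence and uniqueness of the solution to (3.3) follow by similar
arguments as those in the proof of Proposition 1 in [9]. Additionally, notice
that $\mathbf{E}_M$ and $\tilde{\mathbf{E}}_N$ are positive definite matrices.
Before concluding, a final remark is in order. Under Assumption (As2), all
previously defined elements are well defined for all $M$ in the sense of the
Euclidean norm for vectors or induced norm for matrices (see uniform bounds
provided at the end of Appendix A, which will be useful for the derivation of
our asymptotic results).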
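The system (3.3) is straightforward to solve numerically. The sketch below is one possible approach, a plain fixed-point iteration started from zero, and is not taken from the paper; the function name `solve_deltas` and all numerical values are hypothetical. It assumes diagonal $\mathbf{R}_M$ and $\mathbf{T}_N$, and then compares the random quadratic form $\mathbf{u}_M^*\mathbf{Q}_M\mathbf{u}_M$ with its deterministic counterpart $\mathbf{u}_M^*\mathbf{E}_M\mathbf{u}_M$.

```python
import numpy as np


def solve_deltas(r, t, alpha, N, iters=2000, tol=1e-13):
    """Fixed-point iteration for (3.3) with R_M = diag(r) and T_N = diag(t)."""
    delta = 0.0
    for _ in range(iters):
        delta_t = np.sum(t / (1.0 + delta * t)) / N      # second equation in (3.3)
        new = np.sum(r / (delta_t * r + alpha)) / N      # first equation in (3.3)
        converged = abs(new - delta) < tol
        delta = new
        if converged:
            break
    delta_t = np.sum(t / (1.0 + delta * t)) / N
    return delta, delta_t


rng = np.random.default_rng(4)
M, N, alpha, trials = 100, 200, 1.0, 20
r = np.linspace(0.5, 1.5, M)          # hypothetical spatial eigenvalues
t = np.linspace(0.5, 1.5, N)          # hypothetical temporal eigenvalues
delta, delta_t = solve_deltas(r, t, alpha, N)

s = np.ones(M) / np.sqrt(M)           # hypothetical unit-norm signature
u = s / np.sqrt(r)                    # u_M = R_M^{-1/2} s
e = r / (delta_t * r + alpha)         # diagonal of E_M
a_bar = np.sum(u**2 * e)              # u^* E_M u

a_hat_samples = []
for _ in range(trials):
    X = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2)
    B = (X * t) @ X.conj().T / N + alpha * np.diag(1.0 / r)   # Q_M^{-1}
    a_hat_samples.append((u @ np.linalg.solve(B, u)).real)    # u^* Q_M u
a_hat_mean = float(np.mean(a_hat_samples))

residual = abs(delta - np.sum(r / (delta_t * r + alpha)) / N)
```

Starting the iteration from `delta = 0` yields a monotone bounded sequence in this diagonal setting, which is why no damping is used here; other solvers would serve equally well.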



3.2. First-order approximations 

The following lemma provides asymptotic approximations for the expected
values of the random variables $\hat{a}_M$ and $\hat{b}_M$. The result follows
readily from Proposition 1 and Proposition 2 in Section 4.

Lemma 1. With the definitions and under the assumptions above, the following
expectations hold:

$$\mathbb{E}[\hat{a}_M] = \bar{a}_M + O(N^{-3/2}), \qquad \mathbb{E}[\hat{b}_M] = \bar{b}_M + O(N^{-3/2}).$$

Based on the previous approximation rules, we will consider the following
two first-order estimates of the SNR under the supervised and the unsupervised
training settings, namely,
$\overline{\mathrm{SNR}}(\hat{\mathbf{w}}_{0,\mathrm{MVDR}}) = \bar{a}_M^2/\bar{b}_M$
and
$\overline{\mathrm{SNR}}(\hat{\mathbf{w}}_{1,\mathrm{MVDR}}) = \left(\bar{b}_M/\bar{a}_M^2 - 1\right)^{-1}$,
respectively.



3.3. Second-order analysis

The following two theorems establish the asymptotic Gaussianity of the fluc- 
tuations of the SNR performance measures (2.5) and (2.6). Before stating the 
results, we introduce the following quantity, which is shown to be positive in 
Section 5
V M = 7 



-< 2 -tr [E M 1 +1 2 -tr 
N A?" 



+ 4 7 (1 - 77 ) 5m + 4 ( 7 2 ^ tr [E M ] - 7^ tr 



(1 - 77) 




tr [E A/ ] 



1 



E 



A' 



T, 



M 



2 77^tr[E A/ ]-tr 



E3 
N 



+ 7 J 



tr 



E3 
JV 



where we have defined 

' U A/ E M U A/ 



Sm = 



u A/ -E A /u A / 



- 2- 



U A/ E A/ U A/ 1 

u Af E A /u A / 2 



U A/ E M U A/ 
U A/ E M U A/ 



u A/ E M u A/ 

U M E M U A/ 



Tm = 



U M E M U A/ U M E M U A/ 



U M E M U A/ 



Theorem 1. (Supervised Training) Under the definitions and assumptions in 
Section 3.1, the following CLT holds: 



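$$\sigma_{s,M}^{-1}\sqrt{N}\left(\mathrm{SNR}(\hat{\mathbf{w}}_{0,\mathrm{MVDR}}) - \overline{\mathrm{SNR}}(\hat{\mathbf{w}}_{0,\mathrm{MVDR}})\right) \xrightarrow{\;\mathcal{L}\;} \mathcal{N}(0,1)$$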



where



(u A/ E M U M )' 
U M E M U A/ 



Theorem 2. (Unsupervised Training) Under the definitions and assumptions 
in Section 3.1, the following CLT holds: 



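$$\sigma_{u,M}^{-1}\sqrt{N}\left(\mathrm{SNR}(\hat{\mathbf{w}}_{1,\mathrm{MVDR}}) - \overline{\mathrm{SNR}}(\hat{\mathbf{w}}_{1,\mathrm{MVDR}})\right) \xrightarrow{\;\mathcal{L}\;} \mathcal{N}(0,1)$$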



where 



a 



u,M 



(u* u EmUmY 

U A/ E M U Af 



(u A/ E A/ u A/ ) 

U M E A/ U A/ 



(1 - 7m7m) 



The CLTs established in Theorems 1 and 2 make explicit the intricate
dependence of the mean and variance of the realized SNR on the spatial and
temporal covariance matrices $\mathbf{R}_M$ and $\mathbf{T}_N$. In particular,
notice that these two moments uniquely define the asymptotic Gaussian
distributions derived above. Further insights can be gained by a
scenario-based analysis considering particular choices of the covariances
$\mathbf{R}_M$ and $\mathbf{T}_N$. Though undoubtedly of practical relevance,
such an analysis is outside the scope of this work, and is left open for
future research.

We remark that the previous analytical characterization of the asymptotic
distribution of the SNR for a given, fixed diagonal loading parameter could be
used for selecting an improved parameter. Previous work by one of the authors
proposes a simple approach for fixing $\alpha$ by considering only a
first-order asymptotic analysis [3, 4]. Potential approaches exploiting the
second-order asymptotic results provided here might be based on determining
the diagonal loading factor maximizing not only the expected value of the
realized SNR, but a linear combination of the mean and the variance (i.e., the
fluctuations). Given that now not only the variance but the whole distribution
of the realized SNR is available, the previous approach based on the first two
moments could also be extended to the optimization of a given quantile by
borrowing techniques from robust regression and robust statistics. This,
again, is a far from trivial problem which deserves a line of research of its
own.

On a final note, we recall how the asymptotic analysis can shed some light
on the convergence properties of the SNR when the noise includes the
contribution from interfering sources. Using a simplified version of Theorem 1
for time-uncorrelated sources, it was theoretically shown in [3, 4] that, in
scenarios where interferences are much more powerful than the background
noise, the minimum number of snapshots per antenna to achieve an output SNR
within 3 dB of the optimum one becomes: i) $N > 2K$ in the supervised case
(compare with the classical $N > 2M$ of the rule proposed in [5]); ii)
$N > (2 + \mathrm{SNR}_{\mathrm{opt}})K$ in the unsupervised case, where $K$
is the dimension of the interference subspace. Hence, diagonal loading reduces
the number of needed samples by approximately a factor of $K/M$ (relative
interference subspace dimension).



4. Mathematical tools and preparatory results 



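In this section, we introduce some mathematical tools and intermediate
technical results that will be useful for the proof of the central limit
theorems in Section 3. In the sequel, we will denote by
$\mathbf{Z}_M \in \mathbb{C}^{M \times M}$ and
$\tilde{\mathbf{Z}}_N \in \mathbb{C}^{N \times N}$ sequences of arbitrary
diagonal nonrandom matrices with uniformly bounded spectral norm (in $M$ and
$N$, respectively). Similarly,
$\boldsymbol{\Theta}_M \in \mathbb{C}^{M \times M}$ and
$\tilde{\boldsymbol{\Theta}}_N \in \mathbb{C}^{N \times N}$ will represent
sequences of positive definite nonrandom matrices having trace norm uniformly
bounded from above by finite scalars denoted, respectively, by
$\|\boldsymbol{\Theta}\|_{\mathrm{tr},\sup}$ and
$\|\tilde{\boldsymbol{\Theta}}\|_{\mathrm{tr},\sup}$, and trace operator
uniformly bounded away from zero, i.e.,
$\min\{\theta_{\inf}, \tilde{\theta}_{\inf}\} > 0$, where
$\theta_{\inf} = \inf_{M \ge 1}\operatorname{tr}[\boldsymbol{\Theta}_M]$ and
$\tilde{\theta}_{\inf} = \inf_{N \ge 1}\operatorname{tr}[\tilde{\boldsymbol{\Theta}}_N]$.
In particular, notice that
$\|\boldsymbol{\Theta}_M\|_F \le \|\boldsymbol{\Theta}_M\|_{\mathrm{tr}}$, and
so the Frobenius norm of $\boldsymbol{\Theta}_M$ is also uniformly bounded.
For instance, in the cases
$\boldsymbol{\Theta}_M = \frac{1}{M}\mathbf{Z}_M^*\mathbf{Z}_M$ and
$\boldsymbol{\Theta}_M = \mathbf{u}_M\mathbf{u}_M^*$, we have
$\left\|\frac{1}{M}\mathbf{Z}_M^*\mathbf{Z}_M\right\|_F = \frac{1}{M}\left(\operatorname{tr}\left[(\mathbf{Z}_M^*\mathbf{Z}_M)^2\right]\right)^{1/2} = O(N^{-1/2})$
and $\|\mathbf{u}_M\mathbf{u}_M^*\|_F = \|\mathbf{u}_M\|^2 = O(1)$,
respectively. We remark that the positive definiteness of the matrices
$\boldsymbol{\Theta}_M$ and $\tilde{\boldsymbol{\Theta}}_N$ only represents a
purely technical assumption that will facilitate the proofs, but which can be
relaxed to extend the results to the case of arbitrary, not necessarily
positive definite matrices.

Next, we introduce some results that will represent a set of essential tools
for the proof of Theorem 1 and Theorem 2.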
4.1. Gaussian tools



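We first briefly comment on the bounded character of the empirical moments
of the spectral norm. Let $p$ be a fixed integer and let
$\{\tilde{\mathbf{Z}}_N^{(l)}\}$, $1 \le l \le p$, denote a set of $p$
sequences of $N \times N$ diagonal deterministic matrices with uniformly
bounded spectral norm in $N$. Then, for $p \ge 1$, we have

$$\mathbb{E}\left[\left\|\frac{\mathbf{X}_N\tilde{\mathbf{Z}}_N^{(1)}\mathbf{X}_N^*}{N}\cdots\frac{\mathbf{X}_N\tilde{\mathbf{Z}}_N^{(p)}\mathbf{X}_N^*}{N}\right\|\right] \le K_p. \tag{4.1}$$

The proof of (4.1) follows by first writing, using the submultiplicative
property of the spectral norm,

$$\mathbb{E}\left[\left\|\frac{\mathbf{X}_N\tilde{\mathbf{Z}}_N^{(1)}\mathbf{X}_N^*}{N}\cdots\frac{\mathbf{X}_N\tilde{\mathbf{Z}}_N^{(p)}\mathbf{X}_N^*}{N}\right\|\right] \le \mathbb{E}\left[\prod_{l=1}^{p}\left\|\frac{\mathbf{X}_N\tilde{\mathbf{Z}}_N^{(l)}\mathbf{X}_N^*}{N}\right\|\right] \le \left(\prod_{l=1}^{p}\|\tilde{\mathbf{Z}}^{(l)}\|_{\sup}\right)\mathbb{E}\left[\left\|\frac{\mathbf{X}_N}{\sqrt{N}}\right\|^{2p}\right],$$

and then applying the following intermediate result.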

and then applying the following intermediate result. 



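Lemma 2. Let $\mathbf{X}_N \in \mathbb{R}^{M \times N}$ be a matrix having
entries defined as i.i.d. Gaussian random variables with mean zero and
variance one. Then, the following inequality holds for every $q \ge 1$, i.e.,

$$\sup_{N \ge 1}\,\mathbb{E}\left[\left\|\frac{\mathbf{X}_N}{\sqrt{N}}\right\|^q\right] < +\infty.$$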



Proof. The proof is based on some well-known results about the concentration 
of Gaussian measures and its applications to random matrix theory (see, e.g., 
[22]). In particular, we build upon the following large deviation inequality for 
the largest singular value of a Gaussian matrix [23, Theorem 11.13], namely, 



x 



iV 



> t 



< 2exp [ - — 



(4.2) 



for any $t > 0$. Furthermore, for every non-negative random variable $X$, we
have $\mathbb{E}[X] = \int_0^\infty \mathbb{P}(X > x)\,dx$. Now, using the
change of variables $x = t^q$, $dx = q t^{q-1}\,dt$, notice that
$\mathbb{E}[X^q] = \int_0^\infty \mathbb{P}(X^q > x)\,dx = \int_0^\infty \mathbb{P}(X > t)\,q t^{q-1}\,dt$.



Finally, letting X 



- [VII + Vn 



we get from (4.2) 



E 



X 



JV 



<2q 



t"' 



1 dt = 2 q ' 2 qT (|) 



< q q /*+\ 



where T (x) is the Gamma function, and we conclude that 



= K q + [N- 1 ' 2 



E 




Xw 


q- 






Vn 





□ 
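Indeed, if we let
$\mathbf{X}_N = \frac{1}{\sqrt{2}}\mathbf{X}_N^{(\mathrm{re})} + \mathrm{i}\frac{1}{\sqrt{2}}\mathbf{X}_N^{(\mathrm{im})}$,
where the matrices $\mathbf{X}_N^{(\mathrm{re})}$ and
$\mathbf{X}_N^{(\mathrm{im})}$ are independently defined as the matrix
$\mathbf{X}_N$ in Lemma 2, then, applying Jensen's inequality along with
Lemma 2, we get

$$\mathbb{E}\left[\left\|\frac{\mathbf{X}_N}{\sqrt{N}}\right\|^q\right] \le 2^{q/2-1}\left(\mathbb{E}\left[\left\|\frac{\mathbf{X}_N^{(\mathrm{re})}}{\sqrt{N}}\right\|^q\right] + \mathbb{E}\left[\left\|\frac{\mathbf{X}_N^{(\mathrm{im})}}{\sqrt{N}}\right\|^q\right]\right) \le K_q,$$

and (4.1) follows finally by taking $q = 2p$.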
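The uniform boundedness asserted in Lemma 2 can be observed empirically. The sketch below assumes a fixed aspect ratio $M/N = 1/2$, a moment order `q = 4` and a small number of Monte Carlo trials (all hypothetical choices); it estimates $\mathbb{E}\|\mathbf{X}_N/\sqrt{N}\|^q$ for growing $N$ and compares it with $(1 + \sqrt{M/N})^q$, the value suggested by the well-known behavior of the largest singular value of a Gaussian matrix.

```python
import numpy as np

rng = np.random.default_rng(6)
q, trials = 4, 50                     # moment order and Monte Carlo size (illustrative)


def norm_moment(M, N):
    """Monte Carlo estimate of E || X_N / sqrt(N) ||^q for a real Gaussian X_N."""
    vals = [np.linalg.norm(rng.standard_normal((M, N)), 2) / np.sqrt(N)
            for _ in range(trials)]
    return float(np.mean(np.array(vals) ** q))


# Fixed aspect ratio c = M / N = 1/2 while N grows.
moments = {N: norm_moment(N // 2, N) for N in (50, 100, 200, 400)}
reference = (1.0 + np.sqrt(0.5)) ** q   # (1 + sqrt(c))^q
```

The estimates in `moments` stay of the same order as `reference` for all sizes considered, in line with the supremum over $N$ in Lemma 2 being finite.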

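We now introduce two further tools; with some abuse of notation, let
$\Gamma = \Gamma(\mathbf{X}_N, \mathbf{X}_N^*)$ be a $\mathcal{C}^1$ complex
function such that both itself and its derivatives are polynomially bounded.
Following the approach in [9], in our proof of the CLT we will make intensive
use of the Nash-Poincaré inequality, i.e.,

$$\operatorname{var}\left(\Gamma(\mathbf{X}_N, \mathbf{X}_N^*)\right) \le \sum_{i=1}^{M}\sum_{j=1}^{N}\mathbb{E}\left[\left|\frac{\partial\Gamma(\mathbf{X}_N, \mathbf{X}_N^*)}{\partial X_{ij}}\right|^2 + \left|\frac{\partial\Gamma(\mathbf{X}_N, \mathbf{X}_N^*)}{\partial\overline{X}_{ij}}\right|^2\right] \tag{4.3}$$

where the upper bar denotes complex conjugation, as well as the integration by
parts formula for Gaussian functionals, namely

$$\mathbb{E}\left[X_{ij}\,\Gamma(\mathbf{X}_N, \mathbf{X}_N^*)\right] = \mathbb{E}\left[\frac{\partial\Gamma(\mathbf{X}_N, \mathbf{X}_N^*)}{\partial\overline{X}_{ij}}\right]. \tag{4.4}$$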


4.2. Variance controls and estimates of expected values

Let us define the random variables 

= (X W ) = tr [& M Q k M ] , = (Xjy) = tr 

^ (4.5) 

where $k$ is a finite positive integer. The proof of the following variance
estimates essentially relies on the Nash-Poincaré inequality in (4.3).

Lemma 3. With all above definitions, the following variance controls hold: 



and 




Proof. See Appendix B. □ 

Also of particular use in our derivations will be the following approximation 
rules, whose proof has been postponed to Appendix C. 



©mQm 



XjvZjvX^ 
N 
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Proposition 1. With all above definitions, the following expectations hold, 
namely 



E 



^(Xjv) = tr[0E] + 



101 



AT3/2 



and 



E 



(X W ) 



Z (Ijv + (J^T) -1 tr[0E]+0 



101 



N 3 / 2 



Proposition 2. With all above definitions, the following expectations hold, 
namely 



E 



^(Xjv) = tr [0E 2 ] + 

J 1-77 



1 



11 







AT3/2 



and 

E 



= iv tr 



Z (Ijy + feT)" 1 ] tr [0E 2 ] 

J 1-77 



iV 



tr 



EZ (Ijv + SmT)- 1 } tr [0E] + O 

J I-77 



101 



TV 3 / 2 



Proposition 3. With all above definitions, the following expectations hold,
namely 



E 



i 



(l-77) d V* 



1 



1 



7-tr[E 3 ] - 7 2 -tr E 3 tr [0E 2 ] 



f tr [0E 3 ] + O 

(1 - 77) 



101 



TV 3 / 2 



and 



E 



tr 



Z (I w + JmT)- 1 ! f 7 i tr [E 3 ] - 7 2 ^ tr 



E" 



N (I-77)' 



I-77 



tr 



ZE (Ijy + (Sa/T)" 1 ! f £ tr [E 3 ] - 7 3 i tr 



E 



N(l~llf 



1-11 



tr [0E 2 ] + tr [0E 3 ] 



tr [0E] + 7 tr [0E 2 ] 



1 



I -11 J N 



1 



tr 



ZE 2 (I w + SmTY 1 } tr [0E] + O 



\\®\\ F 

TV 3 / 2 
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Proposition 4. With all above definitions, the following expectation holds, 
namely 



3 



i 



(1 - 77 ) 3 



tr [0E 4 ] 



2 tr [0E 3 ] f . 1 



(1 - 77) 4 



+ 



tr [0E 2 ] 

(1 - 77) 4 



7 3 — tr 



E 3 

2tr [0E 2 ] 

(1 - 77) 5 



x S 7 



AT 



tr 



E 



1 



tr[E 3 ]) - 7 (l+ 77 )_tr E 3 - tr [E 3 ] 



1 



N 



+ o 



\\m F 



5. Elements of the proof of the asymptotic Gaussianity of the SNR 



Let us consider the real-valued random variable $\xi_M = A_M\sqrt{N}(\hat{a}_M - \bar{a}_M) + B_M\sqrt{N}(\hat{b}_M - \bar{b}_M)$, where $\hat{a}_M, \bar{a}_M, \hat{b}_M, \bar{b}_M$ are defined in (3.1)-(3.2) and where
$A_M$ and $B_M$ are two real-valued nonrandom coefficients bounded above for
all $M$ by constants $A_{\sup}$ and $B_{\sup}$, respectively. In particular, notice that if



&m = umvl* m then we have $ 



(i) 

M 



a M and 



6a/, and also = am and 



$\bar{b}_M$. We begin this section by stating a theorem that establishes a CLT
for the fluctuations of $\xi_M$, and which will be instrumental in proving Theorem
1 and Theorem 2.



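Theorem 3. Assume that $[A_M, B_M]$ is a deterministic real-valued vector whose
norm is uniformly bounded above and below. Then, under (As1)-(As3), the
following CLT holds:

$$\sigma_{\xi,M}^{-1}(A_M,B_M)\,\sqrt{N}\left(A_M(\hat{a}_M - \bar{a}_M) + B_M(\hat{b}_M - \bar{b}_M)\right) \xrightarrow{\mathcal{L}} \mathcal{N}(0,1), \qquad (5.1)$$

where $\sigma_{\xi,M}^2(A_M,B_M) = [A_M \; B_M]\,\boldsymbol{\Sigma}_M\,[A_M \; B_M]^T$, with $\boldsymbol{\Sigma}_M$ being a
real-valued symmetric positive definite matrix having entries $[\boldsymbol{\Sigma}_M]_{1,1} = \sigma_{M,a^2}$, $[\boldsymbol{\Sigma}_M]_{2,2} = \sigma_{M,b^2}$, and $[\boldsymbol{\Sigma}_M]_{1,2} = [\boldsymbol{\Sigma}_M]_{2,1} = \sigma_{M,ab} = \sigma_{M,ba}$, given by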



7 / * T7l2 \ 2 

CTM ' a2 _ 1 _ ryj \ U M iL 'M U M) 

0~M,ab = 0~M,ba 



27 



(1 - 77") 



2 U M E M U ^ U M E M U A/ 



(1 - 77) 3 



7 2 ^tr [E 3 ,] -7^tr 



N 



E3 
N 



(5.2) 



(5.3) 
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and 
o~M,b 2 



2^ 



73 U M E M u Af u M E M u M 



20 



(1 - 77) 

, 4u M E M U A/ u M E A/ U M J -2 



(1 - 77) 



u M E A/ u m) 



—3 \i\ M ni M 



(1 - 77) 4 



7 2 -tr [E M ]- 7 ^tr 



A r 



E3 
N 



( u m e m"m) 

(1 - 77) 4 



* al tr[ E y+ 7 2 ^tr 



N 



E iV 



2 ( u m e m u m) 



(1 - 77) 5 



x f 



AT 



tr[E 3 M ]) -2 7 7^tr[E 3 M ]-tr 



N 



E3 
N 



+ 7 J 



N 



tr 



E3 
N 



(5.4) 



Proof. Define $\Psi_M(\omega) = \exp(\mathrm{i}\omega\xi_M)$, and let $\mathbb{E}[\Psi_M(\omega)]$ be the characteristic
function of $\xi_M$. The proof of Theorem 3 is based on Lévy's continuity theorem,
which allows us to prove convergence in distribution by showing point-wise
convergence of characteristic functions [24]. More specifically, similarly as in [9],
we study weak convergence to a Gaussian law by showing

$$\mathbb{E}[\Psi_M(\omega)] - \exp\left(-\frac{\omega^2}{2}\,\sigma_{\xi,M}^2(A_M,B_M)\right) \to 0.$$

In particular, we show that

$$\frac{d}{d\omega}\mathbb{E}[\Psi_M(\omega)] = -\omega\,\sigma_{\xi,M}^2(A_M,B_M)\,\mathbb{E}[\Psi_M(\omega)] + R_N(\omega), \qquad (5.5)$$

where $R_N(\omega)$ is an error term vanishing asymptotically as $N \to \infty$ uniformly
in $\omega$ on compact subsets. In order to prove (5.5), we proceed by differentiating
the characteristic function as

$$\frac{d}{d\omega}\mathbb{E}[\Psi_M(\omega)] = \mathrm{i}\,\mathbb{E}[\xi_M\Psi_M(\omega)] = \mathrm{i}A_M\sqrt{N}\,\mathbb{E}\left[(\hat{a}_M - \bar{a}_M)\Psi_M(\omega)\right] + \mathrm{i}B_M\sqrt{N}\,\mathbb{E}\left[(\hat{b}_M - \bar{b}_M)\Psi_M(\omega)\right].$$

The following proposition provides the computation of the expectation $\mathbb{E}[\xi_M\Psi_M(\omega)]$;
see Appendix D for a proof. □

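Proposition 5. With the above definitions, the following expectations hold,
namely

$$\sqrt{N}\,\mathbb{E}\left[(\hat{a}_M - \bar{a}_M)\Psi_M(\omega)\right] = \mathrm{i}\omega\left(A_M\sigma_{M,a^2} + B_M\sigma_{M,ab}\right)\mathbb{E}[\Psi_M(\omega)] + O(N^{-1/2}), \qquad (5.6)$$

and

$$\sqrt{N}\,\mathbb{E}\left[(\hat{b}_M - \bar{b}_M)\Psi_M(\omega)\right] = \mathrm{i}\omega\left(A_M\sigma_{M,ba} + B_M\sigma_{M,b^2}\right)\mathbb{E}[\Psi_M(\omega)] + O(N^{-1/2}). \qquad (5.7)$$

Moreover, the term $O(N^{-1/2})$ depends neither on the coefficients $A_M$ and $B_M$
nor on $\omega$, assuming that this last parameter takes values on a bounded interval.

Therefore, we have (recall that $\sigma_{M,ab} = \sigma_{M,ba}$)

$$\frac{d}{d\omega}\mathbb{E}[\Psi_M(\omega)] = -\omega\left(A_M^2\sigma_{M,a^2} + 2A_MB_M\sigma_{M,ab} + B_M^2\sigma_{M,b^2}\right)\mathbb{E}[\Psi_M(\omega)] + O(N^{-1/2}). \qquad (5.8)$$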
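For the reader's convenience, the step from a differential relation of this type to the Gaussian limit can be made explicit. What follows is a standard integrating-factor sketch and is not taken from the original text; the shorthand $\varphi(\omega)$ for the characteristic function, $\sigma^2$ for the limiting variance and $R(\omega)$ for the vanishing error term is introduced here for illustration only.

```latex
% Shorthand (assumed here): \varphi(\omega) = E[\Psi_M(\omega)],
% \sigma^2 = \sigma^2_{\xi,M}(A_M,B_M), R(\omega) = vanishing error term.
\begin{align*}
\varphi'(\omega) &= -\omega\sigma^2\varphi(\omega) + R(\omega), \qquad \varphi(0) = 1, \\
\frac{d}{d\omega}\Bigl(e^{\omega^2\sigma^2/2}\,\varphi(\omega)\Bigr)
  &= e^{\omega^2\sigma^2/2}\,R(\omega), \\
\varphi(\omega) &= e^{-\omega^2\sigma^2/2}
  + \int_0^{\omega} e^{-(\omega^2 - s^2)\sigma^2/2}\,R(s)\,ds .
\end{align*}
```

Since the exponential factor inside the integral is at most one for $s$ between $0$ and $\omega$, the integral is bounded in modulus by $|\omega|\sup_{|s|\le|\omega|}|R(s)|$, which vanishes whenever the error term vanishes uniformly on compact sets.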



Furthermore, a sufficient and necessary condition for the matrix $\boldsymbol{\Sigma}_M$ to be
positive definite is stated in the following proposition (see Appendix E for a
proof).

Proposition 6. Under the assumptions of Theorem 3, we have

$$0 < \inf_{M\ge1}\sigma_{\xi,M}^2(A_M,B_M) \le \sup_{M\ge1}\sigma_{\xi,M}^2(A_M,B_M) < +\infty.$$

In order to complete the proof of Theorem 3, we need to show that the sequence
$\sigma_{\xi,M}^{-1}(A_M,B_M)\,\xi_M$ is tight, and that every convergent subsequence converges in distribution to a standard
Gaussian random variable. The proof of the previous two arguments relies
on Proposition 6 and follows along exactly the same lines of that of Proposition
6 in [9], and so we exclude it from our exposition.

Remark 1. Theorem 3 can be used to characterize the fluctuations of the performance
of optimal LMMSE or Wiener filters. Here, we particularly mean the
classical statistical problem of estimating the signal $s(n)$ in the linear signal
model (2.1) with $\beta = 1$, by minimizing the Bayesian mean-square error (MSE)
risk. Specifically, recalling that the MSE of a filter $\mathbf{w}$ is given by $\mathrm{MSE}(\mathbf{w}) = 1 - 2\operatorname{Re}\{\mathbf{w}^H\mathbf{s}\} + \mathbf{w}^H\mathbf{R}\mathbf{w}$ (see, e.g., [25, 26]), we notice that the asymptotic
distribution of the MSE achieved by a sample implementation of the optimal
filter $\mathbf{w}_{\mathrm{LMMSE}} = \mathbf{R}_M^{-1}\mathbf{s}$ based on the covariance matrix estimator (2.4), and
denoted by $\hat{\mathbf{w}}_{\mathrm{MSE}}$, can be readily obtained by simply applying Theorem 3 with
$A_M = A_{\mathrm{mse},M} = 2$ and $B_M = B_{\mathrm{mse},M} = 1$ along with $\boldsymbol{\Theta}_M = \mathbf{u}_M\mathbf{u}_M^H$, so that
we get the random variable:

$$\mathrm{MSE}(\hat{\mathbf{w}}_{\mathrm{MSE}}) - \overline{\mathrm{MSE}}(\hat{\mathbf{w}}_{\mathrm{MSE}}) = A_{\mathrm{mse},M}(\hat{a}_M - \bar{a}_M) + B_{\mathrm{mse},M}(\hat{b}_M - \bar{b}_M).$$

Related work on the study of the asymptotic Gaussianity of LMMSE receivers
can be found in [27], where techniques different from those used here, based on the martingale
central limit theorem, are considered without the assumption of Gaussian
observations. We notice that the problem above relies on a covariance matrix
which is unknown and therefore estimated, while in [27] the authors rely on a
given model of the covariance matrix itself, whose structure is assumed to be
known.

We now complete the proof of Theorem 1 and Theorem 2 by showing that,
similarly as in Remark 1, the asymptotic distribution of the SNR performance
measure under both supervised and unsupervised training is given by Theorem
3, for sensible choices of the coefficients $A_M$ and $B_M$.

5.1. Completing the proof of Theorem 1 and Theorem 2

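Let us define the following nonrandom coefficients:

$$A_{s,M} = \frac{2\bar{a}_M}{\bar{b}_M}, \qquad B_{s,M} = -\frac{\bar{a}_M^2}{\bar{b}_M^2}, \qquad (5.9)$$

and

$$A_{u,M} = \frac{2\bar{a}_M\bar{b}_M}{(\bar{b}_M - \bar{a}_M^2)^2}, \qquad B_{u,M} = -\frac{\bar{a}_M^2}{(\bar{b}_M - \bar{a}_M^2)^2}, \qquad (5.10)$$

which are bounded above and away from zero uniformly in $M$ (cf. inequalities
(A.15)-(A.19) in Appendix A). In particular, notice that

$$\frac{A_{s,M}}{B_{s,M}} = \frac{A_{u,M}}{B_{u,M}} = -\frac{2\bar{b}_M}{\bar{a}_M}. \qquad (5.11)$$

Now, observe that we can write

$$\sqrt{N}\left(\mathrm{SNR}_s(\hat{\mathbf{w}}_{\mathrm{MVDR}}) - \overline{\mathrm{SNR}}_s(\hat{\mathbf{w}}_{\mathrm{MVDR}})\right) = \sqrt{N}\left(A_{s,M}(\hat{a}_M - \bar{a}_M) + B_{s,M}(\hat{b}_M - \bar{b}_M)\right) + \varepsilon_{s,M}, \qquad (5.12)$$

and

$$\sqrt{N}\left(\mathrm{SNR}_u(\hat{\mathbf{w}}_{\mathrm{MVDR}}) - \overline{\mathrm{SNR}}_u(\hat{\mathbf{w}}_{\mathrm{MVDR}})\right) = \sqrt{N}\left(A_{u,M}(\hat{a}_M - \bar{a}_M) + B_{u,M}(\hat{b}_M - \bar{b}_M)\right) + \varepsilon_{u,M}, \qquad (5.13)$$

where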

i— ( aM a, M \ , 

£ s ,m = vJV ^— b M , 

\ bM b M J 



Su,M = V N (aM - a M ) 

+ Vn 



aM 



M — a M 
a M 



a-M 



bM — a M b M - a 2 M , 

Next, we show that e s ^ M = o p (1) and s UtM = e^' M + e^' M = o p (1). Indeed, 
notice that we can write 



(h n 2 \ — J- 

(D M - a M ) = £ u,m + £ uj 



M- 



N 

\e.,m\ > £ ) ^ ^ E 









2 




a M 


a M 


\b M \ 




bM 


bM 



'N l 
2 e 



\a M - a M \ 



\b M \ 



?M 



■E 



| 6m - 6m | 



1 2 "I 
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where the last expression follows from Jensen's inequality. Let us further define 
the random variables Xm = 1/ (&m — a \i), an d Xi,m = Xm, <%2,a/ = a 2 M XM, 
Xz,m = o-mXm, Xa,m = Xm, along with the nonrandom coefficients Cm = 

4/ (&Af - ali) 2 , and Ci,Af = 1) C 3:A / = 2a\}C M , Cfc.M = a k M C M , k = 2,4. 
Then, we similarly have 

and (notice that (a| f - a A/ )^ = ( a M + 2omOm + (om — a a/) 2 ) 



pKL>0<^I^|e 



fe (2) 



>e < 



E 



aM 



'N 



bM - a 2 M b M - a 2 M 

' 4 



\x 



M om Om 



y>M — a M \ 



4— \ C k,M\ E \ x k,M\ Wm ~ a M \ 



Finally, using the bounds (A.15)-(A.19) in Appendix A to show that

$$\max_{1\le k\le4}\,\sup_{M\ge1}\,\{C_{k,M}\} < +\infty,$$

and

$$\max_{1\le k\le4}\,\sup_{M\ge1}\,|\chi_{k,M}| < +\infty$$

with probability one, together with Jensen's inequality and Propositions 1 and
2, we conclude that both $\mathbb{P}(|\varepsilon_{s,M}| > \epsilon) \to 0$ and $\mathbb{P}(|\varepsilon_{u,M}| > \epsilon) \to 0$ as $N \to \infty$.
Hence, from (5.12) and (5.13) along with the fact that $\varepsilon_{s,M} = o_p(1)$ and $\varepsilon_{u,M} = o_p(1)$, we conclude that the central limit theorems in Theorem 1 and Theorem
2 follow by Slutsky's theorem and Theorem 3 with $\sigma_{s,M}^2$ and $\sigma_{u,M}^2$ being given
by the quadratic form $\sigma_{\xi,M}^2(A_M, B_M)$, where the coefficients $A_M$ and $B_M$ are
given by (5.9) and (5.10), respectively.
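The linearization behind (5.12) and (5.13) is a first-order (delta-method) expansion of the SNR around the deterministic pair $(\bar{a}_M, \bar{b}_M)$. The sketch below is only an illustration under an assumption that is our reading of the derivation, not a statement from the text: that the supervised and unsupervised SNR take the forms $a^2/b$ and $a^2/(b - a^2)$, respectively. Function names and the numerical values are hypothetical.

```python
def snr_supervised(a, b):
    """Assumed supervised SNR functional a^2 / b."""
    return a * a / b

def snr_unsupervised(a, b):
    """Assumed unsupervised SNR functional a^2 / (b - a^2)."""
    return a * a / (b - a * a)

def gradient(f, a, b, h=1e-6):
    """Central finite-difference gradient of f at (a, b)."""
    da = (f(a + h, b) - f(a - h, b)) / (2 * h)
    db = (f(a, b + h) - f(a, b - h)) / (2 * h)
    return da, db

# Arbitrary illustrative values satisfying b > a^2.
a_bar, b_bar = 0.8, 1.5
A_s, B_s = gradient(snr_supervised, a_bar, b_bar)
A_u, B_u = gradient(snr_unsupervised, a_bar, b_bar)

if __name__ == "__main__":
    print("A_s / B_s =", A_s / B_s)
    print("A_u / B_u =", A_u / B_u)
    print("-2 b / a  =", -2 * b_bar / a_bar)
```

Under these assumed functionals, the two coefficient ratios coincide, so the supervised and unsupervised linearizations differ only by a scale factor.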



6. Numerical validation 

In this section, we compare the empirical distribution of the output SNR obtained
by simulations with the corresponding analytical expressions derived in
this paper. We considered a uniform linear array with elements located half a
wavelength apart. The exploration angle was 0 deg. (desired signal), and the
array received interfering signals from the angles −20, 50 and 55 degrees. All
signals were received at each antenna with power 10 dB above the background
noise. In this toy example, the time correlation matrix was fixed to be a symmetric
Toeplitz matrix with its $n$th upper diagonal fixed to $e^{-n}$, $n = 0, \ldots, N-1$, and the
diagonal loading parameter was fixed to $\alpha = 0.1$. In Figure 1 and Figure 2, we
represent the measured histogram (bars) and asymptotic law (solid curves) of
the output SNR for different values of the parameters $M$, $N$, for supervised
and unsupervised training, respectively. A total of 10,000 realizations
was used to obtain the empirical probability density function. In each
figure, the upper plot corresponds to the case where the number of samples is
lower than the number of antennas, whereas in the lower plot we depict the
opposite situation. Observe that in both cases the asymptotic expressions give
a very accurate description of the fluctuations of the output SNR, even for relatively
low values of $M$, $N$. We also notice that the mismatch observed for very
low dimensions is readily corrected by slightly increasing $M$ and $N$.
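A simulation of the kind described above could be organized along the following lines. This is a minimal sketch under stated assumptions and not the authors' code: the observation model $\mathbf{R}^{1/2}\mathbf{X}\mathbf{T}^{1/2}$, the loaded estimator $\frac{1}{N}\mathbf{Y}\mathbf{Y}^* + \alpha\mathbf{I}$, the supervised SNR definition, the unit noise power and all function names are assumptions introduced here for illustration.

```python
import numpy as np

def steering(theta_deg, M):
    """Unit-norm steering vector of a half-wavelength uniform linear array."""
    k = np.arange(M)
    return np.exp(1j * np.pi * k * np.sin(np.deg2rad(theta_deg))) / np.sqrt(M)

def simulate_snr(M=10, N=20, alpha=0.1, trials=2000, seed=0):
    """Empirical output SNR of a diagonally loaded MVDR filter (supervised case)."""
    rng = np.random.default_rng(seed)
    s = steering(0.0, M)                      # desired signal at 0 degrees
    power = 10.0                              # 10 dB above unit noise power
    # Interference-plus-noise spatial covariance (assumed form).
    R = np.eye(M, dtype=complex)
    for theta in (-20.0, 50.0, 55.0):
        v = np.sqrt(M) * steering(theta, M)   # per-antenna power equals `power`
        R += power * np.outer(v, v.conj())
    # Temporal correlation: symmetric Toeplitz, n-th diagonal exp(-n).
    idx = np.arange(N)
    T = np.exp(-np.abs(idx[:, None] - idx[None, :]).astype(float))
    R_half = np.linalg.cholesky(R)
    T_half = np.linalg.cholesky(T)
    snr = np.empty(trials)
    for t in range(trials):
        X = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2)
        Y = R_half @ X @ T_half.conj().T                 # correlated snapshots
        R_hat = Y @ Y.conj().T / N + alpha * np.eye(M)   # loaded sample covariance
        w = np.linalg.solve(R_hat, s)                    # MVDR direction; scale cancels
        snr[t] = np.abs(w.conj() @ s) ** 2 / np.real(w.conj() @ R @ w)
    return snr

if __name__ == "__main__":
    samples = simulate_snr()
    counts, edges = np.histogram(samples, bins=40, density=True)
    print("mean SNR:", samples.mean(), "std:", samples.std())
```

The histogram of `samples` plays the role of the bars in Figures 1 and 2; overlaying the Gaussian density predicted by the CLT would require the deterministic mean and variance expressions of the paper, which are not reproduced in this sketch.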

7. Conclusions 

We have shown that the SNR of the diagonally loaded MVDR filters is asymptotically
Gaussian and have provided a closed-form expression for its variance.
A CLT has been established for the fluctuations of the SNR performance of
both supervised and unsupervised training methods. We resorted to the Nash-Poincaré
inequality and the integration by parts formula for Gaussian functionals
to derive variance and bias estimates for the constituents of the SNR
measure. In fact, the same elements also describe the fluctuations of the mean-square
error performance of this filter, which can be written in terms of realized
variance and bias, as well as of other optimal linear filters, such as the Bayesian
linear minimum mean-square error filter. The results hold for Gaussian observations,
but extensions based on a more general integration by parts formula
can be investigated for non-Gaussian observations.

Appendix A: Further definitions and useful bounds 

Throughout the appendices, we will use the following definitions, namely 



Let $\mathbf{A}$ and $\mathbf{B}$ denote two arbitrary square complex matrices. The following
will be denoted in the sequel as resolvent identity, namely, $\mathbf{A}^{-1} - \mathbf{B}^{-1} = \mathbf{A}^{-1}(\mathbf{B} - \mathbf{A})\mathbf{B}^{-1}$, where we have tacitly assumed the invertibility of $\mathbf{A}$ and
$\mathbf{B}$. In particular, using the previous resolvent identity, we notice that




and also 



F 




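$$\mathbf{Q} = \alpha^{-1}\mathbf{R} - \alpha^{-1}\mathbf{Q}\,\frac{1}{N}\mathbf{X}\mathbf{T}\mathbf{X}^*\mathbf{R}. \qquad (A.1)$$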
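Both the resolvent identity and the decomposition (A.1) are easy to confirm numerically. The sketch below is an illustration only: it assumes that $\mathbf{Q} = \left(\frac{1}{N}\mathbf{X}\mathbf{T}\mathbf{X}^* + \alpha\mathbf{R}^{-1}\right)^{-1}$, as suggested by the argument used for (A.7), and draws arbitrary random matrices; the helper `random_hpd` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
M, N, alpha = 4, 6, 0.1

def random_hpd(n):
    """Random Hermitian positive definite matrix (illustrative helper)."""
    G = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return G @ G.conj().T + n * np.eye(n)

inv = np.linalg.inv

# Resolvent identity: A^{-1} - B^{-1} = A^{-1} (B - A) B^{-1}.
A, B = random_hpd(M), random_hpd(M)
resolvent_gap = np.linalg.norm(inv(A) - inv(B) - inv(A) @ (B - A) @ inv(B))

# Identity (A.1) with the assumed definition of Q.
R, T = random_hpd(M), random_hpd(N)
X = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2)
S = X @ T @ X.conj().T / N                  # (1/N) X T X^*
Q = inv(S + alpha * inv(R))
a1_gap = np.linalg.norm(Q - (R / alpha - Q @ S @ R / alpha))

if __name__ == "__main__":
    print("resolvent identity residual:", resolvent_gap)
    print("(A.1) residual:", a1_gap)
```

Both residuals should be at the level of floating-point rounding, which is consistent with (A.1) being the resolvent identity applied to $\mathbf{Q}$ and $\alpha^{-1}\mathbf{R}$.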



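Furthermore, we define

Now, we introduce some inequalities that will be extensively used in our
derivations. First, let $X$ and $Y$ be two scalar and complex-valued random variables
having second-order moment. Then, we have

$$\operatorname{var}(X + Y) \le \operatorname{var}(X) + \operatorname{var}(Y) + 2\sqrt{\operatorname{var}(X)\operatorname{var}(Y)}, \qquad (A.2)$$

and also, from the Cauchy-Schwarz inequality,

$$\left|\mathbb{E}\left[(X - \mathbb{E}[X])\,Y\right]\right| = \left|\mathbb{E}\left[(X - \mathbb{E}[X])(Y - \mathbb{E}[Y])\right]\right| = \left|\operatorname{cov}(X, Y)\right| \le \operatorname{var}^{1/2}(X)\,\operatorname{var}^{1/2}(Y). \qquad (A.3)$$

Furthermore, we will be using [28, Chapter 3]

$$\left|\operatorname{tr}[\mathbf{A}\mathbf{B}]\right| \le \|\mathbf{A}\mathbf{B}\|_{\mathrm{tr}} \le \|\mathbf{A}\|_{\mathrm{tr}}\,\|\mathbf{B}\|. \qquad (A.4)$$

In particular, if $\mathbf{A}$ is Hermitian nonnegative, we can write

$$\left|\operatorname{tr}(\mathbf{A}\mathbf{B})\right| \le \|\mathbf{B}\|\operatorname{tr}(\mathbf{A}). \qquad (A.5)$$

Moreover, we will also repeatedly use

$$\|\mathbf{A}\mathbf{B}\|_F \le \|\mathbf{A}\|\,\|\mathbf{B}\|_F. \qquad (A.6)$$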
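As a quick sanity check, the matrix inequalities (A.4)-(A.6) can be evaluated on random matrices. The following sketch is illustrative only; it assumes that $\|\cdot\|$, $\|\cdot\|_{\mathrm{tr}}$ and $\|\cdot\|_F$ denote the spectral, trace (nuclear) and Frobenius norms, and the helper names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
P = A @ A.conj().T  # Hermitian nonnegative definite matrix

def spectral(Z):
    """Spectral norm (largest singular value)."""
    return np.linalg.norm(Z, 2)

def trace_norm(Z):
    """Trace (nuclear) norm: sum of singular values."""
    return np.linalg.norm(Z, 'nuc')

def frobenius(Z):
    """Frobenius norm."""
    return np.linalg.norm(Z, 'fro')

tol = 1e-9  # slack for floating-point rounding
a4_holds = (abs(np.trace(A @ B)) <= trace_norm(A @ B) + tol
            and trace_norm(A @ B) <= trace_norm(A) * spectral(B) + tol)
a5_holds = abs(np.trace(P @ B)) <= spectral(B) * np.trace(P).real + tol
a6_holds = frobenius(A @ B) <= spectral(A) * frobenius(B) + tol

if __name__ == "__main__":
    print("(A.4):", a4_holds, "(A.5):", a5_holds, "(A.6):", a6_holds)
```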

We further provide some inequalities involving the notation and elements
defined in Section 3.1. In particular, the following inequality will be used in the
proof of the variance controls given by Lemma 3:

$$\sup_{M\ge1}\|\mathbf{Q}\|^p < +\infty, \quad \text{a.s.} \qquad (A.7)$$

Indeed, using the fact that $\left(\frac{1}{N}\mathbf{X}\mathbf{T}\mathbf{X}^* + \alpha\mathbf{R}^{-1}\right) \ge \alpha\mathbf{R}^{-1}$ a.s.$^2$ (i.e., the random
matrix $\frac{1}{N}\mathbf{X}\mathbf{T}\mathbf{X}^*$ is positive semidefinite with probability one, having $|M - N|$
zero eigenvalues), notice that $\|\mathbf{Q}\| \le \alpha^{-1}\|\mathbf{R}\| \le \alpha^{-1}\|\mathbf{R}\|_{\sup}$ with probability
one.

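From the previous inequalities, it also follows that

$$\sup_{M\ge1}\operatorname{tr}\left[\boldsymbol{\Theta}_M\mathbf{Q}_M^k\right] \le \alpha^{-k}\|\mathbf{R}\|_{\sup}^k\,\|\boldsymbol{\Theta}\|_{\mathrm{tr}} < +\infty, \quad \text{a.s.} \qquad (A.8)$$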

The following two lemmas can be derived as in [9]. 
Lemma 4. The quantities $\delta_M$, $\tilde{\delta}_M$ admit the following upper and lower bounds:

Ki < 5 M < CsupOT 1 ||R|| sup , 

5 ini <5M<\\T\\ sup , 

where we have defined 

, _ c inf ||R|| inf - a||T|| inf 



|R|l S u P l|T|| sup a + c 8up ||R|| sup ||T|| sup 



2 almost surely 
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Similarly, observe that, 

1m < c sup a- 2 ||R|| s 2 up , 7m < ||T|| s 2 up . (A.9) 

Additionally, thanks to Jensen's inequality, the lower bounds on the quantities 
5m, 5m directly imply that 

1m > — Cf > 0, 7M > ~5f nl > 0. (A. 10) 

Csup 

Lemma 5. The quantity $1 - \gamma_M\tilde{\gamma}_M$ admits the following upper and lower
bounds:

i „ - ^ -i „ ll R llinf ll T llmf . -i 

1 - JM7M S 1 - "7 x— 7 r < 1 



«+||R|| sup j|T|| sup ) (a + Csup ||R|| sup ||T lls , 



and 

1 a 2 



1-7M7M > ; 2 4f 

C sup H-K-llsup 



In general, we have, for any finite k > 0, 



and also 



suptr[0 M Ey <^ fe iiRii: up n©ii sup < 

M>1 



sup tr 

N>1 



< II T| 







< +oo, 



SU]) 



foftr [0 M Ei] > 



,ta k IIRI 



inf 



c S u P ||R|| sup ||R|| inf 



inf tr 

N>1 



> 



af a fe ||T| 



inf 



Csup l|R|| sup l|T|| sup 



>o, 



> 0. 



(A.ll) 
(A.12) 

(A.13) 
(A.14) 



In particular, if &m = umii* m , then tr [&m] = || u m| ■ an( l using the fact 
that j|sj| 2 = 1, we notice that infj\/>i ||um|| 2 > ||R|| sup > and, additionally, 
sup M>1 ||uj| 2 < IIRIIjnf < +oo, so that it follows from the above inequalities 
that 



max sup {om^m} < +oo, a.s. 
M>\ 

max sup {5m,&m} < +oo, 

M>1 

min inf {om^a/I > 0. 
M>1 1 ' 



(A.15) 
(A.16) 
(A.17)

Moreover, observe that for a positive definite matrix $\mathbf{A} \in \mathbb{C}^{M\times M}$ the Cauchy-Schwarz
inequality implies that $(\mathbf{u}^*\mathbf{A}\mathbf{u})^2 \le \mathbf{u}^*\mathbf{A}^2\mathbf{u}$, for all $M$, and hence, using
the bounds for $1 - \gamma_M\tilde{\gamma}_M$ above,

$$\sup_{M\ge1}\frac{1}{\hat{b}_M - \hat{a}_M^2} < +\infty, \quad \text{a.s.,} \qquad (A.18)$$

$$\sup_{M\ge1}\frac{1}{\bar{b}_M - \bar{a}_M^2} < +\infty. \qquad (A.19)$$



Appendix B: Proof of Lemma 3 (Variance Controls) 

We first consider the first family of random variables defined in (4.5), for $k \ge 1$. Using the Nash-Poincaré
inequality in (4.3) and Jensen's inequality, we get, by applying conventional
differentiation rules for real-valued functions of complex matrix arguments (cf. [9,
Section III]) along with the chain rule and after gathering terms together,



var(<(X)) <fc^i-Etr 



r=l 



N 



Q r &Q 2(k r+1 '0*Q 



,xrx* 



N 



Q r 0*Q 



2(fc-r+l) 



0Q r 



,XT 2 X* 



N 



where we have used



0(X) 
dXu 



= eje. 



d(X* 



with $\mathbf{e}_i$ being the unit-norm vector whose $i$th entry is 1. Then, we further notice
that, for any two constants $p, q \ge 1$,



Etr 



Q p 0Q 29 0*Q 



,XT Z X* 



= E 



XT^X* 



N 



N 

\@Q p+q 



< 



xrx' 



N 



<ii©ir F E 



tr [Q p 0Q 29 0*Q p ] 

2(p+g) 



IQI 



XT 2 X* 



N 



<K\\&r F E 



XT 2 X* 



N 



O 1101 



where we have used inequalities (A.6), (A.7) and (4.1).

We finally consider the second family of random variables defined in (4.5), for $k \ge 1$. By the Nash-Poincaré
and Jensen's inequalities, and similarly as in the previous case, we can
write



(*)) 



k + 1 
< -^Etr 



©Q 2fc © 



XZZX* 



k + 1 
N 



tr 



Q fc 0*0Q A 



XZZX* 



A^ 



( fc + 1 )E^ Etr 



Q- X ^ X _0 Q 2(fc-+l)0 



A/ 



xz x* xrx 
14 



2V 



A r 



AT 



! tr 



Q'0* 



XZ X* 



N 



Q 



2(fc— r+l) 



K *0Q X 



A r 



A r 



Then, observe that, for any two constants p, q > 1, 



Etr 



_„XZX* „„^ 4 XZ*X* „XT 2 X* 
QP 0Q 2 <?0* QP. 



AT 
< E 



XT 2 X* 



N 



N 



tr 



A' 



< 



Q P ^0Q 2 <0*^lcy> 



< E 



XT^X* 



N 



xzx* 



A/ 



|Q|| 2p ||0Q 9 ||^ 



< K\\&\\ 2 F E 1/2 



XT 2 X* 



N 



E 1 - 



XZX* 



Ar 



O 1101 



where we have used the Cauchy-Schwarz inequality, along with the inequalities 
(A.6), (A.7) and (4.1). 



Appendix C: Proof of Propositions 1 to 4 (Expected value 
estimates) 

Let us start by studying the following quantity, namely 

1 N 

-^E[z]^[Q fc x iX; V (C.i) 

1=1 



E 



■ XZX 



N 
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Using the integration by parts formula in (4.4), we find that (ti = [T] ; ) 



M 



E[Q fe x i x ( *]..=^E[[Q fc ]. r X ri X ji 



l + t,4Etr[Q] l + t,iEtr[Q] 



E 



fe-i 



P =i 



tijrEtr [Q] 



[Q^xr] ij -tr[Q fe -^+ 1 ] 



, (C2) 



By plugging (C.2) into (C.1), we obtain



3 



Q fc 



xzx* 

N 



N 



tr 



fc-i 

P =i 



XZFX* 

N 



[ I w + -(Etr[Q])T 



E [Q* 



1 fc 
-Etr [Q fe -f +1 ]-^E 



9=1 



Q 



XZFX 



Furthermore, from the expression in (A.l), we observe that 

XTX 



E [Q%=a~ lE [Q fe ~% [R]^ — or 1 



E 



Q 



N 



fRl 



(C.3) 



(C.4) 



Then, by using in (C.4) the identity (C.3) with Z = T along with the defini- 



tion of the matrix E, i.e., [E] ■ = 
manipulations we get the following expression 



5 M I M + oiR 1 



, after some algebraic 



EfQ^.^EfQ^E], 



[Q fe E] 



k-l 

-E E 

P =i 



1 k 
-Etr [Q fc -f +1 ]+^E 



J y 



9=1 



fc _ g+1 XZFX E 



(C. 



In particular, from the expressions in (C.3) and (C.5), we obtain 

1 



lltr [Q fc ] =lEtr 



EQ 



fc-i 



5 At tr 

N 



k-l -. 



P =i 



tr 



EQ P 



XTFX 



iV 



^Etr [EQ 
iEtr[Q^]+^Af. 



(9) 
M ' 



(C.6) 



9=1 
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where we have defined the following error terms (here = j^Im) Z = E and 
Z = T): 



(?) 



A I 



Z0Q 



XZFX 



Before proceeding further, notice that (A. 3) along with Lemma 3 implies that, 
for any q > 1, 

' 1011 



*M = ° 



7V3/2 



(C.7) 



We now elaborate on (C.6) in the case k = 1. Specifically, note that we can 
write 



— Etr [Ql - — EtrfEl 



and so we get 



N 



N 



i_ Etr [Q]-ltr [E] 



EF 



1 

N 
FE 



tr [EQ] + O (N~ 2 ) 
1 



N 



Etr [EQ] + O (N~ 



-Etr[Q]--tr[E] l--tr 



-Etr[EQ]j =0(N~ 2 ) 



Moreover, using (A. 5), we observe that, uniformly in M, 



1 

— tr 

N 



EF 



-Etr [EQ] 



< 5 M ^Etx [Q] 



|E|| <1, 



which follows by Assumption (As2) from the fact that 



sup max 

M>1 



IN 



N 



Etr [Q]T 



< 1. 



Hence, we have 



N 

In particular, noting that 



_Etr[Q] = -tr[E] + 



N 



N- 



tr 



E F 



— Etr [Ql - — tr [El ) tr 



0FE 



< ||T|j sup , the next result 



together with ||0|| tr < vM||0|| F and sup M>1 
follows straightforwardly. 

Lemma 6. With all above definitions, the following approximation rule holds 



tr 



©F 



= tr 



0E 



O 



TV 3 / 2 



(C.8) 



Rubio, Mestre and Hachem/A CLT on the SNR of the DL beamformer 



27 



The variance control in (C.7) along with (C.8) imply that 



N 



Etr [Q A 



N 



-Etr 



EQ 



k-1 



k-1 



tr 



P =i 



XTFX 



JV 



lEtrtQ^j+ofl). (C.9) 



Similarly, we can write the following estimates from (C.5) and (C.3), respec- 
tively, 



E 



0Q 



xzx* 



N 



tr 



1 

' N 
fc-i 



I w + l(Etr [Q])T 



E 



©Q 



P =i 



XZFX 



iltr [Q fc -f +1 ] + O^ I|0||J 



AT3/2 



(CIO) 



and 



E 



©Q 



E 



fe-i 

p=i 



„ XTFX 

0Q P E 

AT 



lEtr[Q-^] +0 (»: 



(C.ll) 



The proof of Propositions 1 to 4 can now be readily completed by handling
the estimates (C.9), (C.10) and (C.11), successively, following an iterative
scheme from $k = 1$ to $k = 4$.



Appendix D: Proof of Proposition 5 

We concentrate first on (5.6). Observing that 



E 



Q— XTX*R 

N 



*(w) 



it is sufficient to investigate the term E [Q-^XTX** Now, observe that 

we can express 



E 



Q_L X TX**K 



1 N 

= -2t,E[Qx I x l **( & 
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and therefore, using the integration by parts formula, we get after some algebraic 
manipulations 



E 



+ iaM-=E 



iu)B—==F, 



\.uB—==e 
Vn 



Q 2 uu*Q 



XTEX 



N 



-E 



J 'J 



a » XTEX* 
Q 3 uu*Q - E 



Q 2 uu*Q 2 - E 



— tr [Q] - S M 



N 



XTEX 
Q E 

N 



(D.l) 



Therefore, we can conclude that 



E [(a M - a M ) *m M] = i uA— =E 

v iv 



iwB 



u*Q 2 uu*Q - Eu$ (w) 



AT 



A r 



u*Q 3 uu*Q : 



XTEX* 



JV 



-Euf (w) 



u*Q 2 uu*Q 2 — Eu$ (w) 



+ 3^1, Af , 



where we have defined 



1,M 



tr [Q] - S M ) u*Q Euf (w) 



Hence, after some algebraic manipulations and the application of the variance 
controls in Lemma 3, we finally obtain 



ATE \(a M - a M ) *m (ui)] = iwAE [u*Q 2 u] 
+ iwBE [u*Q 3 u] E 



XTEX 
u Q 77 Eu 



E [* (w)] 



, XTEX 
u*Q - Eu 



A^ 



iw£?E [u*Q 2 u] E 



u*Q 



XTEX* 



N 



-Eu 



AT 

E [* (cj)] 
E[# (w)] + O (V^ 1 / 2 ) , 






and (5.6) follows by Propositions 1 to 3. 
We now deal with (5.7). Observing that 



E 



[Q 2 L* 



Q 2 -XTX* 



*(w) 



[Rl 



3 ' 



we only need to investigate the quantity 



(D.2) 



E 



Q 2 — XTX* 



N 



N 



i=i 



J2m [Q a x,x?] *(c 



Thus, we can develop 



by using the integration by parts 



[Q 2 x,xj] y *(«) 

formula and applying similar algebraic manipulations as in the proof of (5.6). 
Then, using the previous estimate in (D.2), we can write 



E 



[Q 2 L*M =E [QE] .*(, 



1-77 



XTEX 
Q E 
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Consequently, we can finally state that 



E [(b M - b u ) *m (w)] = iuA-=E 

V N 
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u*Q 4 uu*Q 
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. 9 XTEX 

^ EU 



tr [Q 2 ] - 3- 



77 



,XTEX ^ 
u*Q^^Eu 



In particular, note that (A. 3) along with Lemma 3 implies that 3^2, a/ +3^3,m 
0{N~ 1 ). On the other hand, we also notice that 



E 
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where 



y^.M = e 



^tr [Q]-6 M j u*Q E 2 u* (w) 
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It can be trivially seen that D4,M = 0(N" 1 ). Furthermore, we readily see that 



E 



XTEX * 9 \ , s 

u*Q Eu - 7U*E 2 u * (oj) 



N 



, 1 
■ iu)A—=E 
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where the term E [(u*QEu — u*E 2 u) * (w)] has been examined above, and 
where 



y 6 . 



,1/ 



-E 



AT 



tr [Q] - J M u*Q 



XTE X* 



N 



Eu* (w) 



It can be readily seen that 3^5, m = 0(N 1 ). Inserting the above back into the 
original expression, we observe that 
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and the result follows readily by using Propositions 1 to 4. 



Appendix E: Proof of Proposition 6 

We will only prove the case $B_M \neq 0$, such that $\inf_{M\ge1} B_M = B_{\inf} > 0$ (the
complementary situation is much easier to handle). Consider first writing



°£,m (Am,Bm) = 
where we have defined 



-Bm u m e 1/ u a/ 

(1 -77) 2 



Vm{Am,B m ) 



E 



V M (Am, B m ) = (fl tr [E^] + 7 2 ^ tr , 
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E3 
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tr[E 3 A/ ]) -2 7 7^tr[E 3 M ]-tr 



N 



E3 
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Tm (Am,Bm) 

21 
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iV 
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E3 
N 



Sm (Am, B 



1 



M 



A 
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1 (1 _ 77 ) 



, U A/ E M U ^/ 
u M E if u M 



U A/ E M U A/ 



U M E '1/ U M 
U A/ E M U M 



Tm (Am,B m ) 



Am , r, u M E M u -W 

uXf E ifUM 



(1-77)^ + 2 



By inequality (A. 11) along with Lemma 5 and the upper and lower bounds of 
Bm, we have that 

^Sjf u^E^ um/ (1 — 77) 2 ) is bounded uniformly above and away from zero. 
We now show that

$$0 < \inf_{M\ge1} V_M(A_M,B_M) \le \sup_{M\ge1} V_M(A_M,B_M) < +\infty. \qquad (E.1)$$

Indeed, the upper bound in (E.1) follows readily by the triangle inequality
and Lemma 5 along with inequalities (A.9) and (A.11)-(A.13), together with
the uniform upper and lower bounds $A_{\sup}$ and $B_{\inf}$ of, respectively, $A_M$ and
$B_M$.
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In order to prove the lower bound, we first show that Sm > T M - Indeed, 
observe that we can write 



U A/ E A/ U A/ 
U A/ E A7 U A/ V u M E Af U M 



'm% u m 



1 U A/ E M U Af u M E A/ U A/ - ( U M E M U A/ 
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where the last statement follows by the Cauchy-Schwarz inequality. This shows 
that, by completing the squares 
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7 a/ (1 - 7m7m) 

Using this in the expression of Vm (Am, Bm) and grouping terms, we readily 
see that 
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which is a consequence of the Cauchy-Schwarz inequality (and cquivalcntly for 
Ejif instead of E M ), and this leads to 



Vm (Am, B m ) > 
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(1 - 7A/7A/) 7A/ 



7M ^tr [E 3 M ] ~7M^tr 
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Now, using again the Cauchy-Schwarz inequality we are able to write 
7 M = (1 tr [E M ] ) < i tr [E M ] ^ tr [E M ] = <5 M ^ tr [E 3 M ] , 



and this implies that 
7A/^tr [E M ] - 7M l t r 



E3 
M 



N 



E3 
M 



Jf tr [ e m] tr (ijv + s M T N y 



Therefore, we have shown that 



Vm(A m ,B m ) > 



\ti [E M ] Uti 



rp-l-p3 



(1 - 7m7m)7m 



and the lower bound in (E.l) finally follows from Lemma 5 together with in- 
equalities (A.9), (A. 13) and (A. 14). 
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Fig 1. Numerical evaluation of fitness accuracy of CLT (Supervised Training). 







Fig 2. Numerical evaluation of fitness accuracy of CLT (Unsupervised Training). 



